Code Structure–Guided Transformer for Source Code Summarization
Abstract
Code summaries help developers comprehend programs and reduce the time needed to infer program functionality during software maintenance. Recent efforts resort to deep learning techniques such as sequence-to-sequence models for generating accurate code summaries, among which Transformer-based approaches have achieved promising performance. However, effectively integrating code structure information into the Transformer is under-explored in this task domain. In this article, we propose a novel approach named SG-Trans to incorporate code structural properties into the Transformer. Specifically, we inject local symbolic information (e.g., code tokens and statements) and global syntactic structure (e.g., the dataflow graph) into the self-attention module of the Transformer as inductive bias. To further capture the hierarchical characteristics of code, the local information and global structure are designed to be distributed across the attention heads of the lower layers and higher layers of the Transformer. Extensive evaluation shows the superior performance of SG-Trans over state-of-the-art approaches. Compared with the best-performing baseline, SG-Trans still improves by 1.4% and 2.0% on two benchmark datasets, respectively, in terms of METEOR score, a metric widely used for measuring generation quality.
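The idea of injecting structure into self-attention as an inductive bias can be illustrated with attention masks. The following is a minimal sketch, not the SG-Trans implementation: the function names, the toy token sequence, and the half-and-half split between lower and higher layers are all assumptions made for illustration. A local mask lets a token attend only to tokens in the same statement, and a global mask lets it attend only to tokens it shares a data-flow edge with.

```python
import math


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention; mask[i][j] == False blocks i from attending to j.

    Every row of the mask must allow at least one position (here, the diagonal).
    """
    dim = len(queries[0])
    output = []
    for i, q in enumerate(queries):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) if mask[i][j] else -math.inf
            for j, k in enumerate(keys)
        ]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]  # blocked positions become 0.0
        total = sum(weights)
        weights = [w / total for w in weights]
        output.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return output


def statement_mask(statement_ids):
    """Local structure (assumed form): tokens attend within their own statement."""
    return [[a == b for b in statement_ids] for a in statement_ids]


def dataflow_mask(num_tokens, edges):
    """Global structure (assumed form): tokens attend along data-flow edges and to themselves."""
    mask = [[i == j for j in range(num_tokens)] for i in range(num_tokens)]
    for src, dst in edges:
        mask[src][dst] = True
        mask[dst][src] = True
    return mask


def mask_for_layer(layer, num_layers, local_mask, global_mask):
    """Hypothetical policy: lower half of the layers use local structure, upper half global."""
    return local_mask if layer < num_layers // 2 else global_mask


# Toy input for the two statements "x = 1" and "y = x" (six tokens).
statement_ids = [0, 0, 0, 1, 1, 1]
dataflow_edges = [(0, 5)]  # the x defined at position 0 is used at position 5

embeddings = [[float(i % 3), 1.0] for i in range(6)]  # made-up 2-d token vectors
values = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]  # one-hot values

local = statement_mask(statement_ids)
flow = dataflow_mask(len(statement_ids), dataflow_edges)

# With one-hot values, each output row is exactly the attention distribution of that token.
local_out = masked_attention(embeddings, embeddings, values, local)
flow_out = masked_attention(embeddings, embeddings, values, flow)
```

With one-hot value vectors, each output row can be read directly as the attention weights, which makes it easy to check that a masked position receives no weight. A per-head rather than per-layer assignment of masks would follow the same pattern, with `mask_for_layer` replaced by a lookup keyed on the head index.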
Related resources
A Convolutional Attention Network for Extreme Summarization of Source Code
Attention mechanisms in neural networks have proved useful for problems in which the input and output do not have fixed dimension. Often there exist features that are locally translation invariant and would be valuable for directing the model’s attention, but previous attentional architectures are not constructed to learn such features specifically. We introduce an attentional neural network th...
Automated feature discovery via sentence selection and source code summarization
Programs are, in essence, a collection of implemented features. Feature Discovery in software engineering is the task of identifying key functionalities that a program implements. Manual feature discovery can be time-consuming and expensive, leading to automatic feature discovery tools being developed. However, these approaches typically only describe features using lists of keywords, which can...
Source code transformations for efficient SIMD code generation
Despite the effort invested in recent years in commercial compilers to generate efficient SIMD instruction-based code sequences from conventional sequential programs, the small number of compilers that can automatically use these instructions achieve unsatisfactory results in most cases. This work shows how exposing register-level reuse in source code helps vectorizing compilers such as ICC to gene...
Source Code Implications for Malcode
The availability of source code for both exploits and malicious code is higher than it has ever been in the history of computing. More important, the source code for highly successful, high-profile malicious tools is now available. Several years ago it was almost impossible to obtain the source for these worms and exploits, forcing actors to rely on minor changes to binaries to generate new var...
Preprocessing of Object-Oriented Source Code for Code Retrieval
Object oriented source code occurs in diverse programming languages with documentation using miscellaneous standards, comments in individual styles, or associated test cases that are hard to exploit through information retrieval or knowledge discovery techniques. Typically, the information about object-oriented source code for a software system is distributed across several different sources, w...
Journal
Journal title: ACM Transactions on Software Engineering and Methodology
Year: 2023
ISSN: ['1049-331X', '1557-7392']
DOI: https://doi.org/10.1145/3522674